Implication of genetic variants in primary microRNA processing sites in the risk of multiple sclerosis

Summary Background Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system with a well-established genetic contribution to susceptibility. Over 200 genetic regions have been linked to the inherited risk of developing MS, but the disease-causing variants and their functional effects at the molecular level are still largely unresolved. We hypothesised that MS-associated single-nucleotide polymorphisms (SNPs) affect the recognition and enzymatic cleavage of primary microRNAs (pri-miRNAs). Methods Our study focused on 11 pri-miRNAs (9 primate-specific) that are encoded in genetic risk loci for MS. The levels of mature miRNAs and potential isoforms (isomiRs) produced from those pri-miRNAs were measured in B cells obtained from the peripheral blood of 63 MS patients and 28 healthy controls. We tested for associations between SNP genotypes and miRNA expression in cis using quantitative trait locus (cis-miR-eQTL) analyses. Genetic effects on miRNA stem-loop processing efficiency were verified using luciferase reporter assays. Potential direct miRNA target genes were identified by transcriptome profiling and computational binding site assessment. Findings Mature miRNAs and isomiRs from hsa-mir-26a-2, hsa-mir-199a-1, hsa-mir-4304, hsa-mir-4423, hsa-mir-4464 and hsa-mir-4492 could be detected in all B-cell samples. When MS patient subgroups were compared with healthy controls, a significant differential expression was observed for miRNAs from the 5’ and 3’ strands of hsa-mir-26a-2 and hsa-mir-199a-1. The cis-miR-eQTL analyses and reporter assays pointed to a slightly more efficient Drosha-mediated processing of hsa-mir-199a-1 when the MS risk allele T of SNP rs1005039 is present. On the other hand, the MS risk allele A of SNP rs817478, which substitutes the first C in a CNNC sequence motif, was found to cause a markedly lower efficiency in the processing of hsa-mir-4423. Overexpression of hsa-mir-199a-1 inhibited the expression of 60 protein-coding genes, including IRAK2, MIF, TNFRSF12A and TRAF1. The only target gene identified for hsa-mir-4423 was TMEM47. Interpretation We found that MS-associated SNPs in sequence determinants of pri-miRNA processing can affect the expression of mature miRNAs. Our findings complement the existing literature on the dysregulation of miRNAs in MS. Further studies on the maturation and function of miRNAs in different cell types and tissues may help to gain a more detailed functional understanding of the genetic basis of MS. Funding This study was funded by the Rostock University Medical Center (FORUN program, grant: 889002), Sanofi Genzyme (grant: GZ-2016-11560) and Merck Serono GmbH (Darmstadt, Germany, an affiliate of Merck KGaA, CrossRef Funder ID: 10.13039/100009945, grant: 4501860307). NB was supported by the Stiftung der Deutschen Wirtschaft (sdw) and the FAZIT foundation. EP was supported by the Landesgraduiertenförderung Mecklenburg-Vorpommern.


Introduction
Multiple sclerosis (MS) is a chronic, inflammatory and neurodegenerative disease of the central nervous system (CNS). 1 The pathological hallmark of MS is the formation of demyelinating lesions in the brain and spinal cord, which are caused by the infiltration of immune cells (e.g., T cells and B cells). 2 Clonal expansion of B cells in MS CNS usually results in intrathecal antibody production. 3 The clinical presentation and course of MS are heterogeneous. 4 Symptoms of MS include mobility restrictions (oftentimes due to paresis and spasticity), visual impairment, bladder and bowel problems, pain, depression, cognitive deficits and fatigue. 5 Different courses of MS are distinguished 6 : relapsing-remitting MS (RRMS), secondary progressive MS (SPMS) and primary progressive MS (PPMS). In the early phase of the disease, 85À90% of the patients have RRMS with reversible episodes of neurological deficits (known as relapses) lasting several days or weeks. Over time, RRMS patients tend to stop having relapses but continue to develop permanent neurological deficits, which is referred to as the SPMS stage. 7 A minority of patients (10À15%) have PPMS, with progression of disability without acute relapses from disease onset. RRMS is typically diagnosed in young adults between 20 years and 35 years of age, whereas PPMS usually begins at »40 years of age. 1,8 Moreover, it has been shown that the female-to-male ratio is 2.5:1 for patients with RRMS/SPMS and 1.2:1 for patients with PPMS. 9 There is still no curative therapy for MS, but a better understanding of the immunological and neurobiological processes underlying MS has led to the development of treatments that can significantly reduce disease activity. 1 For instance, monthly infusions of the monoclonal antibody natalizumab, which inhibits the transendothelial migration of lymphocytes into the CNS by blocking their attachment to cell adhesion proteins, proved highly effective in preventing relapses and slowing progression of disability. 10,11 The newer so-called immune reconstitution therapies (IRTs) even have the potential to induce long-term drug-free remission in many patients. 12 With IRTs, components of the adaptive immune system are depleted to allow for renewal. Typical representatives of IRTs are alemtuzumab and cladribine, which are administered in annual treatment courses. 13 A common mechanism of action is the preferential depletion of memory B cells. 14 Following depletion, the circulating CD19 + B cells repopulate within 6-9 months [14][15][16]. B cells are key players as they mediate cytokine production, antigen presentation, antibody synthesis and the formation of ectopic follicles at sites of inflammation. 3 However, the reader is referred to the literature for a more comprehensive overview on the broad spectrum of disease-modifying treatments available for MS 17À19 and on the roles of B cells in MS. 20,21 The exact causes of MS are still unknown, but it is well established that the individual risk of developing MS is determined by interactions of genetic and nongenetic factors. Environmental and lifestyle factors that have been found to be associated with MS risk include Epstein-Barr virus (EBV) infection, cigarette smoking, adolescent obesity and vitamin D deficiency. 22,23 The strongest genetic susceptibility variants for MS are located within the human leukocyte antigen (HLA) region. In particular, the HLA-DRB1*15:01 allele confers an increased risk of MS with an odds ratio (OR) of 3.92, whereas the HLA-A*02:01 allele has protective effects (OR=0. 67). 24 In addition, more than 200 non-HLA MS risk loci with ORs ranging from 1.06 to 2.06 have been identified in large-scale genetic studies with more than 100,000 subjects of primarily European ancestry. 25,26 Cell type-specific enrichments of MS genetic signals have pointed to prominent roles of microglia 26 as well as T h 17 cells and memory B cells 27 in the pathogenesis of MS. However, the disease-causative variants remain largely unclear, as different single-nucleotide polymorphisms (SNPs) in linkage disequilibrium (LD) can explain the association with MS. 28 Most disease-associated SNPs are expected to capture regulatory effects, which means that they are colocalised with quantitative trait loci (QTLs), e.g., for gene expression (eQTLs) and splicing (sQTLs). 29 In contrast to extensive investigations of mRNA eQTLs and sQTLs across many human tissues and cell types, 30À32 there are still relatively few studies on microRNA (miRNA) expression QTLs (miR-eQTLs). 33,34 MicroRNAs are small non-coding RNAs about 22 nt in length that act as post-transcriptional regulators of gene expression. 35 A total of 2,883 human mature miR-NAs are currently listed in the miRBase database release 22. 36 Approximately half of all known miRNAs originate from intragenic regions. 37 However, the expression of miRNA and host gene can be uncoupled. 38 Mature miRNAs are processed from stem-loop regions of longer primary miRNA (pri-miRNA) transcripts in distinct maturation steps 39 : A key step is the cleavage of the miRNA stem-loop by Microprocessor, 40,41 a nuclear complex of the RNase Drosha and its essential cofactor DGCR8. This leads to a small hairpin-shaped precursor miRNA, which is exported into the cytoplasm where it is cleaved by the RNase Dicer. One strand of the resulting miRNA duplex is loaded into the so-called RNAinduced silencing complex, which mediates the decay and/or translational repression of target transcripts that contain a binding site for the mature miRNA. Alternative cleavage sites and post-maturation sequence modifications can give rise to isoforms of miRNAs (isomiRs) that differ in sequence from the canonical mature miRNA. 42 Sequence motifs in the miRNA stem-loop region are known to influence the processing by Microprocessor. These are a UG motif at position »À14/À13 from the 5 0 Drosha cleavage site, a GUG/UGU motif in the apical loop and a »16À21 nt downstream CNNC motif, 43 which is bound by the proteins DDX17 and SRSF3. 44,45 Roughly 80% of all conserved human miRNA sites contain at least one of these motifs. 44 Further determinants of an efficient miRNA processing are a GNNU motif and a mismatched GHG motif in the basal stem region and an extensive base pairing of the »35-bp stem. 41,46 However, yet unknown sequence motifs and accessory proteins are thought to contribute to the Drosha-mediated cleavage. 47 SNPs that are located in features required for the recognition of pri-miRNAs can affect their processing. First studies could show a dependence of the expression of miRNAs on the genotype of MSassociated SNPs in cis: For carriers of the risk alleles of the SNPs rs2910164 (C) and rs1414273 (C), an increased expression of hsa-miR-146a-5p and hsa-miR-548ac-3p, respectively, was described in blood cells. 48À50 In this study, we conducted a more systematic investigation of miRNAs derived from genetic loci associated with MS risk. To this end, we measured the levels of prioritised mature miRNAs and isomiRs in B cells from MS patients (n=63) and healthy controls (n=28) and analysed whether their expression is subject to cis-miR-eQTL effects. We then used luciferase reporter assays to verify whether the enzymatic processing of selected miRNA stem-loops is indeed causally related to the genotype of the proximal SNPs. Moreover, we aimed to identify new target genes that are post-transcriptionally regulated by the miRNAs under scrutiny. Our study provides insights into the molecular mechanisms underlying the pathogenesis of MS by uncovering the putative functional role of genetic variants implicated in disease susceptibility.

Selection of microRNAs from MS-associated genetic regions
A database-driven approach was employed to prioritise MS-associated genetic variants in potential pri-miRNA processing sites ( Figure 1). First, we identified all human miRNA stem-loops that are recorded in the miRBase database (release 21) 36 and that are encoded within a distance of <250 kb to the 233 MS-associated lead SNPs, which have been reported in the most recent genome-wide association study (GWAS) in MS. 26 Second, we used the LDproxy and LDpair modules of the web-based tool LDlink (release 3.0) 51 to check whether the miRNA loci contain SNPs (hereinafter referred to as MIR SNPs) that are in LD with the MS lead SNPs. More specifically, we excluded miRNA-encoding regions for which all proximal SNPs had an r 2 <0.1 in the European subpopulation of the 1000 Genomes panel, 52 and we selected genetic variants with D'>0.7 that are located within a miRBase-annotated stem-loop sequence or 50 bp upstream or downstream of the putative Drosha cleavage sites and that had a minor allele frequency (MAF) >1% according to dbSNP version 150. 53 If the MIR SNPs were not included in the LDlink tool, adjacent SNPs were considered as proxy. Finally, we excluded SNPs that presumably do not affect the canonical pri-miRNA processing by Drosha (e.g., SNPs in the vicinity of mirtrons 39 ).
The prioritised pri-miRNAs may give rise to different miRNA isoforms that differ at the 5 0 or 3 0 end from the miRBase-annotated mature miRNA forms. Mature This study comprised three parts: First, primary miRNAs encoded in genetic risk loci for MS were identified by a database-driven approach. Second, DNA, B cells and B-cell RNA were collected from MS patients as well as from healthy controls. Flow cytometry was used to phenotype B-cell subpopulations. The expression levels of mature miRNAs and isomiRs were measured using specific qPCR assays. The subjects were genotyped with respect to MS-associated SNPs within the miRNA-coding regions. Third, the effect of the SNPs on the enzymatic cleavage of the primary miRNAs by the Microprocessor complex was investigated using luciferase reporter assays. Transcriptome profiling was used to identify miRNA target genes. MIR=microRNA, MS=multiple sclerosis, qPCR=quantitative polymerase chain reaction, SNP=single-nucleotide polymorphism.
Articles miRNAs may also originate from both arms of a precursor miRNA, even if not explicitly recorded in the miRBase database. 36 Therefore, we extracted potential isomiRs from the IsomiR Bank database 54 and ordered them by the cumulative reads per million (RPM) values. Additionally, we inspected the normalised miRNA expression data pileup plots provided in the miRCarta database version 1.1. 55 Based on this information, we derived up to 2 potential isomiRs with a length of >16 bases for each arm of each precursor miRNA. The isomiRs were named following the nomenclature by Cloonan et al. 56 The web-based tools RNAfold 57 and forna 58 were used to visualise the predicted minimum free energy (mfe) secondary structures of selected miRNA stemloop sequences.

Study groups and samples
We were interested in examining whether the selected miRNAs are dysregulated in B cells of patients with MS and whether their levels are related to the genotype of MS-associated SNPs. For this purpose, 20 ml of peripheral blood were taken from MS patients during regular clinical visits as well as from healthy donors at the Rostock University Medical Center between July 2015 and December 2019. Individuals younger than 18 years of age were excluded from this observational study. The diagnosis of MS has been confirmed according to the revised McDonald criteria. 59 Neurological impairment was rated using the Expanded Disability Status Scale (EDSS). 60 Routine medical care was provided to all patients. Hence, the patients were treated and monitored according to the guidelines and recommendations of the German Society of Neurology.
The blood samples were collected from a peripheral vein into tubes with ethylenediaminetetraacetic acid (EDTA) as anticoagulant. Immediately following the blood withdrawal, each sample was processed to obtain DNA and B cells (Figure 1). DNA was purified from 200 µl of whole blood using the QIAamp DNA Blood Mini Kit (Qiagen) and stored at -20°C. Peripheral blood mononuclear cells (PBMC) were isolated using Ficoll density gradient separation (Histopaque-1077, Sigma-Aldrich). Untouched B cells were then enriched from the PBMC by negative selection using the Pan B Cell Isolation Kit (Miltenyi Biotec). Accordingly, non-B cells were magnetically labelled with a cocktail of biotin-conjugated monoclonal antibodies and anti-biotin MicroBeads and then removed by magnetic separation following manufacturer's instructions. The number of isolated B cells was counted under a microscope using Neubauer chambers. From each sample, two aliquots of 100,000 B cells were cryopreserved in freezing medium (Biofreeze, Biochrom) in the vapor phase of liquid nitrogen at below -140°C. From the remaining B cells, total RNA was preserved using the miRNeasy Mini Kit (Qiagen) in combination with the RNase-free DNase set (Qiagen). The RNA was stored at -80°C.

Flow cytometry
The purities of freshly isolated B cells were assessed by flow cytometry using BD FACSCalibur and BD FACSAria instruments and CD19 or CD20 antibodies conjugated to fluorescein isothiocyanate (FITC) or phycoerythrin (PE) (RRIDs: AB_2726271, AB_2726142 and AB_2656064) (Miltenyi Biotec). The acquired data were analysed using Flowing Software version 2.5.1 (Turku Bioscience).
To characterise the samples in more depth, we aimed to acquire information on the composition of the peripheral B cells from the healthy subjects and the patients with MS. The B-cell phenotyping was done with the cryopreserved B cells using a BD FACSAria Illu system. For this purpose, the B cells were quickly thawed in a 37°C water bath, washed with phosphate buffered saline and then stained with the fluorochrome- The data were further processed in the FlowJo software version 10 from BD. The FlowJo plugin CytoNorm version 0.9 61 was applied for normalising batch effects based on the data of the internal controls. The plugin FlowAI version 2.1 62 was applied for checking the data quality (e.g., the steadiness of the flow rate) and for detecting and removing outlier events. The gating strategy for the identification of B cell subsets was adapted from Morbach et al. 63 First, live single CD19 + cells were determined based on their forward and side scatter properties and the signals for the live/dead and CD19 markers. We then gated on CD27 À IgD + naive B cells, CD27 + IgD + nonswitched memory B cells, CD27 + IgD À switched memory B cells, CD27 À IgD À memory B cells, CD24 ++ CD38 ++ transitional B cells, CD24 À CD38 ++ plasmablasts, CD21 À/low-CD38 À/low B cells and CD23 + B cells, 64 as shown in Figure   S1. The number of gated events was used to determine the respective B-cell subpopulation frequencies.

Measurement of mature microRNAs in B cells
The integrity of the B-cell RNA samples was checked using an Agilent Bioanalyzer 2100 with RNA 6000 Pico kits (Agilent Technologies). RNA concentrations were measured using a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific).
After the RNA quality check, the levels of mature miRNAs and miRNA isoforms were measured by quantitative polymerase chain reaction (qPCR) using custom Specialty TaqMan Array Cards (48 format) for standard TaqMan miRNA assays with FAM-labelled probes (Thermo Fisher Scientific). Reverse transcription (RT) was performed using 35 ng of RNA per sample and a custom target-specific stem-loop RT primer pool. The cDNA was then preamplified in 12 PCR cycles with custom PreAmp primers following the manufacturer's protocol for the use of Megaplex primer pools (Applied Biosystems). Finally, the qPCR measurements were prepared using TaqMan Universal Master Mix II with UNG and performed using a ViiA 7 Real-Time PCR System (Applied Biosystems) with 45 amplification cycles. Each miRNA was measured either in single wells or in technical triplicates, and hsa-miR-191-5p was included in the analysis as reference miRNA. 65 Raw C T values were determined using the Expres-sionSuite software version 1.3 (Thermo Fisher Scientific) and averaged over replicates. Assays for which the automatic threshold algorithm failed were excluded from the further analyses. Missing values, which resulted when no C T value could be determined within 45 PCR cycles, were imputed using the expectationmaximisation algorithm that is implemented in the R package nondetects. 66 For sensitivity analyses, the imputed dataset was truncated to a maximum C T value of 35. The data were normalised to the reference miRNA (DC T method) and converted to the linear scale using the equation 2 ÀDCT £1,000.

Genotyping
The DNA extracted from the blood samples was used to determine the genotypes of MIR SNPs and tagging SNPs for HLA-A*02:01 67 and HLA-DRB1*15:01 68 for each subject. The genotyping was done by PCR with allele-specific TaqMan probes labelled with FAM or VIC on custom Specialty TaqMan Array Cards (Thermo Fisher Scientific). For each reaction, we used »2 ng of DNA that was diluted in TaqMan Universal Master Mix II with UNG (Applied Biosystems). PCR amplification was carried out following manufacturer's instructions in a ViiA 7 Real-Time PCR System (Applied Biosystems). Genotype calling was performed by allelic discrimination with the TaqMan Genotyper Software version 1.6 (Applied Biosystems).

Evaluation of primary microRNA processing efficiencies
We employed a luciferase reporter system to evaluate whether the pri-miRNA processing efficiency depends on the genotype of MS-associated SNPs ( Figure 1). The assay has been adapted from Allegra et al. 69,70 and is based on a luciferase reporter gene that is fused with a miRNA precursor sequence. We first tested the assay with the pri-miRNA hsa-mir-16-1 because its processing has already been extensively studied. 44,71,72 We utilised the luciferase reporter vector CmiT000001-MT05 (GeneCopoeia), which encodes the secreted Gaussia luciferase (GLuc) as reporter gene and the secreted alkaline phosphatase (SEAP) as internal control for normalisation. In addition, we purchased miExpress precursor miRNA expression plasmids for the prioritised miRNAs as well as for hsa-mir-16-1 from GeneCopoeia. The plasmids were amplified in E. coli cultures and purified using the Qiagen EndoFree Plasmid Maxi Kit and QIAfilter Midi Cartridges.
Mutations resembling the naturally occurring genetic variants in the miRNA-coding sequence region (i.e., MIR SNPs) were introduced into the precursor miRNA plasmids. In the hsa-mir-16-1 precursor plasmid, an artificial C>T mutation at position 3' (+19) was introduced. This mutation has been previously shown to suppress the cleavage of hsa-mir-16-1 by the Microprocessor complex. 44,73 The primers for the desired point mutations were designed using the QuikChange Primer Design program from Agilent Technologies and purchased from Thermo Fisher Scientific. The mutageneses were performed using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent Technologies) according to the manufacturer's instructions. The success of each mutagenesis was confirmed by commercial Sanger sequencing using the EGFP-Cfor primer (Microsynth Seqlab).
The different miRNA precursor sequences with the respective alleles were cloned into the luciferase reporter vector at the 3' untranslated region (UTR) of the GLuc gene. This was done by conventional restriction digestion and ligation. SfaAI and either EcoRI or XhoI (Thermo Fisher Scientific) were used as restriction enzymes. The precursor miRNA fragments were prepared with the NucleoSpin Gel and PCR Clean-up Kit (Macherey-Nagel) and ligated into the restriction sites of the luciferase reporter vector with T4 DNA ligase (Thermo Fisher Scientific). Successful cloning was confirmed by gel electrophoresis or DNA sequencing (Microsynth Seqlab). The cloned constructs were then amplified in E. coli and purified using the Nucleobond Xtra Midi EF Kit (Macherey-Nagel).
HeLa cells were transiently transfected with the reporter constructs in 3 biological replicates using the FuGENE HD Transfection Reagent (Promega). The HeLa cells were originally obtained from the German Cancer Research Center (Heidelberg, Germany). They were regularly tested for mycoplasma contamination using the MycoAlert Mycoplasma Detection Kit (Lonza). For establishing the assay with the hsa-mir-16-1 constructs, four different amounts of plasmid DNA (100 ng, 200 ng, 300 ng and 500 ng) were used, and supernatants were collected at 24 h, 48 h and 72 h. Otherwise, 200 ng and 300 ng of plasmid DNA were used, and the transfected HeLa cells were cultured for 24 h and 48 h. In this time, the stem-loop structure that is formed by the GLuc-precursor miRNA hybrid transcript can be recognised by the endogenous Microprocessor complex. A more efficient cleavage of the miRNA stemloop leads to a preferential degradation of the luciferase transcript and thus reduced reporter protein levels. The GLuc and SEAP activities were measured in technical duplicates using the Secrete-Pair Dual Luminescence Assay Kit (GeneCopoeia) and a GloMax-Multi Detection System (Promega) according to the manufacturers' recommendations.
Afterwards, average ratios of the luminescence catalysed by GLuc and SEAP were calculated. A higher GLuc/SEAP ratio indicates a lower Drosha cleavage activity. The ratios were further normalised to those resulting for the negative control, that is, the original luciferase reporter vector without an inserted miRNA precursor fragment. The decrease in relative luminescence as compared to the negative control is a measure of the pri-miRNA processing efficiency. 70 In this calculation, the processing efficiency for the negative control vector is set to zero.

Target gene analysis
We aimed to identify direct target genes of those miR-NAs that were found to be processed dependent on the genotype of MS-associated SNPs. For this purpose, we conducted a transcriptome profiling analysis of HeLa cell cultures ( Figure 1). The cells were transiently transfected with the respective miExpress precursor miRNA expression plasmids (GeneCopoeia) or the scrambled control clone for the pEZX-MR04 vector (CmiR0001-MR04, GeneCopoeia) as negative control using the FuGENE HD Transfection Reagent (Promega). Total RNA was isolated 24 h (from 2 replicates) and 48 h (from 4 replicates) following transfection using the miRNeasy Mini Kit (Qiagen). RNA concentrations were measured using a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific), and RNA integrity was assessed with an Agilent 2100 Bioanalyzer using RNA 6000 Nano kits (Agilent Technologies).
To confirm the overexpression of mature miRNA molecules, a qPCR analysis was performed with predesigned TaqMan single-tube assays from Thermo Fisher Scientific. The cDNA was generated from 10 ng intact RNA with miRNA-specific stem-loop primers and mixed with TaqMan Universal Master Mix II with UNG (Applied Biosystems). The qPCR measurement was then run in triplicates for 45 cycles on a ViiA 7 Real-Time PCR System (Applied Biosystems). As reference miRNA, hsa-miR-191-5p was chosen. 65 The Expression-Suite software version 1.3 (Thermo Fisher Scientific) was used to determine C T values based on the FAM reporter signals. The average C T values per triplicate were normalised to those of the reference miRNA to obtain DC T values.
The same RNA samples were used for the transcriptome analysis. From each sample, 210 ng of total RNA was used as starting material to generate amplified, fragmented and biotinylated single-stranded sense strand DNA using the GeneChip WT PLUS Reagent Kit (Applied Biosystems). The hybridisation on highresolution Clariom D arrays for human (Applied Biosystems) was then carried out for 16 h at 45°C in a GeneChip Hybridization Oven 645 (Affymetrix) following the manufacturer's protocol. After washing and staining in a GeneChip Fluidics Station 450 (Affymetrix), the microarrays were scanned using a GeneChip Scanner 3000 7G (Affymetrix). The scans were imported to the Affymetrix GeneChip Command Console version 4.0 to extract the signal intensities for the >6.7 million 25mer oligonucleotide probes per array. The Transcriptome Analysis Console (TAC) version 4.0.2 (Applied Biosystems) was used for initial quality control assessments and for processing the data with the signal space transformation robust multi-array average algorithm for background adjustment, quantile normalisation, probe set summarisation and log2 transformation.
Based on the processed data, an analysis of differential gene expression was performed in TAC. Potential target genes of the miRNAs were identified by filtering those transcript clusters (TC) that were significantly lower expressed at both time points (i.e., at 24 h and at 48 h), with at least 33.3% lower levels at one time point, in HeLa cells transfected with the precursor miRNA plasmids compared to HeLa cells transfected with the negative control plasmid. A nominal significance level of a=5% was chosen. The filtering result was then further confined to protein-coding genes with an official gene symbol according to the HGNC database. 74 Finally, to obtain the most likely direct target genes, we used RNAhybrid version 2.2 75 for checking whether there are predicted binding sites for the canonical miR-NAs or isomiRs in the 3' UTRs of the mRNAs. The 3' UTR sequences were retrieved with the BioMart tool from Ensembl release 101, 76 and we requested an mfe of less than À16 kcal/mol for the miRNA-target interaction as well as a perfect match with the seed region of the miRNA (i.e., position 2 to 7).
We evaluated whether the genes were already listed as predicted or experimentally determined target genes in the miRWalk 3.0 database 77 and in miRTarBase release 8.0, 78 respectively. The search in miRWalk was restricted to 3' UTR miRNA binding sites. Moreover, to assign the putative target genes to functional categories, we searched them in signalling pathways using Reactome release 75, 79 in Gene Ontology sets using Enrichr 80 and in the UniProt Knowledgebase. 81 The categories with the greatest gene set overlap were recorded. We also studied whether or not the genes are expressed in cell types of the blood and brain using the microarray data from Novershtern et al. 82 (GEO accession GSE24759) and the RNA-sequencing data from Darmanis et al. 83 and by demanding a transcript level of >100.

Statistical analyses
The statistical analyses were performed in R version 3.6.2 unless stated otherwise. For descriptive statistics, means and standard deviations (SD) per group were calculated. The inferential statistics were exploratory in nature. Therefore, the significance level was generally set at a=0.05 to minimise the risk of type II errors. The data were visualised in beeswarm/violin plots and bar plots using the R packages beeswarm, vioplot and gplots, respectively.
We tested for differences in B-cell subpopulation frequencies and miRNA expression levels (in linear scale) between the study groups using linear models while adjusting for age and sex as potential confounding variables. Type II F-tests were calculated for all main effects in the models with the R package car. 84 Pairwise comparisons between the study groups were performed with Tukey post hoc tests adjusted for age and sex using the R package multcomp. 85 We used Pearson correlation tests to compare the levels of canonical miRNAs and their potential isomiRs.
The SNP genotype distributions were compared between the MS patients and healthy controls using one-tailed Fisher's exact tests in order to evaluate the association of risk alleles with MS in our study population. Additionally, allelic ORs were calculated using the Wald method. The effect of each MIR SNP on the expression level of the respective mature miRNA was examined in cis-miR-eQTL analyses. For this purpose, we fitted linear mixed-effects models using the R package lme4. 86 The study group was treated as a random effect. Genotype, age and sex were considered as fixed effects. The models were fitted without interaction term and with the genotype effect being captured in an additive fashion using the restricted maximum likelihood approach. Significance values were then computed with type II Wald x 2 tests using the R package car. 84 For pairwise genotype comparisons, Tukey post hoc tests adjusted for age, sex and group were carried out.
The balanced dataset from the luciferase reporter assays was analysed using an additive 3-way analysis of variance (ANOVA), with amount of plasmid DNA, time point and genotype as independent variables and GLuc/SEAP ratios or the decrease in relative luminescence as dependent variable. Accordingly, F-tests were used to examine the significance of individual effects. Welch two-tailed t-tests were applied for comparing the data that were obtained under the same experimental conditions with plasmids, which represent the two alleles of a particular SNP.

Ethics
This study was conducted with approvals by the local ethics committee of the University of Rostock (permit numbers A 2014-0112 and A 2016-0188) and in compliance with the principles of the Declaration of Helsinki. All subjects gave prior written informed consent for the scientific use of their biosamples.

Role of funders
The funders had no role in the study design, data collection, analysis and interpretation, decision to publish or preparation of the manuscript.

Prioritised microRNAs from genetic risk loci for MS
We identified a total of 11 pri-miRNAs from MS-associated genetic regions that were considered for the further investigations in the present study (Table 1). These pri-miRNAs contain SNPs in the adjacent 50 bp around the miRNA precursor sequence that were suspected to influence the Drosha-mediated enzymatic cleavage of the stem-loop. In case of hsa-mir-4304, two such MIR SNPs were found. Seven of the 12 ascertained MIR SNPs are in complete LD (D'=1) with the nearby MS lead SNP from the GWAS, 26 which means that one MIR SNP allele is always inherited together with one specific MS lead SNP allele according to the 1000 Genomes haplotype data. 52 Six of the 12 MIR SNPs are rare genetic variants with a global MAF of less than 2% (Table 1).
Except for hsa-mir-26a-2 and hsa-mir-199a-1, all selected pri-miRNAs are primate-specific, 36 and all were classified as Drosha-dependent in the study by Kim et al. 87 Six of the miRNA stem-loops are presumably processed from introns of host gene transcripts (Table 1). Our research group has previously explored the role of hsa-mir-548ac, which is encoded in the first intron of the CD58 gene, and we found that MS risk allele carriers have increased levels of the mature miRNA in bloodderived cells. 48,49 Another prioritised pri-miRNA is hsamir-934, which is located in the fourth intron of the VGLL1 gene, the only sex chromosomal MS risk locus that has been reported in the GWAS from 2019. 26 The majority of the selected miRNAs (n=7) have a 4-digit number. This means that they were discovered relatively recently, that they are usually not covered by standard qPCR-or array-based miRNA measurement platforms, and that their regulatory functions are often still largely unclear.
Three of the MIR SNPs affect the basal stem (Table 1). For instance, the A allele of SNP rs73558572 causes a mismatch in the hsa-mir-934 secondary structure. All the other MIR SNPs are located in the flanking segments. Three of them belong to a downstream CNNC motif (rs817478, rs1005039 and rs41292017). We hypothesised that the pri-miRNA processing   efficiency and/or the Drosha cleavage sites are dependent on the allele status of these SNPs.

Characteristics of the study population and samples
The study population comprised 91 subjects in total. From these, 120 blood samples were collected and considered eligible for the analyses in the present study.
The study population was divided into 6 study groups: healthy controls (n=28), patients with PPMS who were treated with glucocorticosteroid pulse therapy (n=13) and four groups of RRMS patients (n=50) that were differentiated by the type of treatment received (Table 2). A subset of 29 RRMS patients was treated with natalizumab. Seven RRMS patients were treated with fingolimod (n=3), glatiramer acetate (n=3) or interferon-b-1b (n=1) before switching to an IRT (RRMS before IRT group). These patients received a first treatment course of intravenous alemtuzumab (n=4) or oral cladribine (n=3) immediately after the blood collection, and they all donated at least one additional blood sample at a later time point during therapy. The samples from patients under therapy with alemtuzumab (n=37) or cladribine (n=6) were obtained at least 4 months after the last treatment week, i.e., after partial B-cell repopulation following B-cell depletion. In the alemtuzumab group, one to four blood samples were collected per patient.
There was a female preponderance in the healthy control group and in the RRMS groups but not in the PPMS group ( Table 2). The average age of the controls was 28.0 years. The patients were older, with a mean age of 39.4 years and a mean disease duration of 8.3 years. They had, on average, 0.3 relapses in the year prior to the blood sampling and a mean EDSS score of 3.0. The patients with PPMS had the highest age and the highest degree of disability on average. The patients who were about to switch to an IRT were in an early phase of the disease, and they had the highest relapse rate. To account for these differences, we controlled for potential confounding by age, sex and group assignment in the statistical analyses of the data described hereafter.
The number of cells after enrichment of B cells averaged 4.3 million. Flow cytometry of the freshly isolated cells revealed a mean B-cell purity of 85.2% (2 out of 120 samples had missing values). Using the cryopreserved B cells, we were able to distinguish 8 B-cell subpopulations based on the expression of cell surface markers. The mean percentages of the different B-cell subsets across the 6 study groups are shown in Table  S1. In general, CD27 À IgD + naive B cells accounted for the largest proportion (up to 72.4% on average), whereas transitional and CD21 À/low CD38 À/low B cells occurred at the lowest frequencies (up to 7.4% and 9.0% on average, respectively). When comparing the 6 study groups, there were significant differences in the proportions of all subpopulations (F-test P<0.05). For instance, CD27 + IgD + non-switched memory B cells represented »24% of CD19 + cells in the healthy controls. The respective percentages were higher, on average, in RRMS patients treated with natalizumab (43.0%) and much lower in RRMS patients on IRT with alemtuzumab or cladribine (up to 8.9% on average). Age and sex did not explain the variance in these data (F-test P>0.05).

Differential expression of mature microRNAs in B cells
According to miRBase, 36 14 mature miRNAs are produced from the prioritised pri-miRNAs from MS-associated genetic loci (Table 1). Standard TaqMan assays were available for 13 of these. The qPCR assay design failed for hsa-miR-4492-3p due to a high GC content of 94%. To consider the possible formation of different mature miRNAs from the same stem-loop, we utilised information from IsomiR Bank 54 and miRCarta 55   A total of 120 blood samples from 91 subjects were analysed in this study. Up to four blood samples were collected per patient. From the 7 patients that provided a blood sample immediately before the initiation of IRT, at least one sample was also obtained after the administration of alemtuzumab (n=4) or cladribine (n=3 define potential 3' isomiRs (n=7), 5' and 3' isomiRs (n=2) and putative miRNAs that may be generated from the other arm than annotated in miRBase 36 (n=6). Thus, a total of 28 assays were used for the measurement of those miRNAs in B cells from MS patients and healthy controls. The data of 22 qPCR assays could be included in the further analyses. For each study group, the average expression levels relative to the reference miRNA hsa-miR-191-5p, which could be detected with C T values <20 in all samples, are given in Table 3. For the other 6 assays, we did not obtain valid data as the automatic threshold algorithm failed (THOLDFAIL flag): hsa-miR-934-5p|{isomiR}|14_33|, hsa-miR-3661-5p|{hsa-miR-3661}|20_41|, hsa-miR-3661-5p|{isomiR}|20_40|, hsa-miR-3671-5p|{isomiR}|3_19|, hsa-miR-3671-5p|{iso-miR}|3_19|sub.6.T>C and hsa-miR-4252-3p|{hsa-miR-4252}|34_52|.
Six miRNAs were detected in less than half of the samples: hsa-miR-548ac-5p and -3p, hsa-miR-934-5p and -3p, hsa-miR-3671-3p and hsa-miR-4464-3p. The highest expression levels (mean DC T values <10) were seen for hsa-miR-26a-5p and hsa-miR-199a-3p isoforms as well as for hsa-miR-4492-3p|{isomiR}|58_74|. In those cases where both the canonical miRNA and an isomiR from the same arm of the stem-loop were measured, the data were highly correlated (Pearson correlation P0.0005). This was also true for the two forms of hsa-miR-4464-5p that are shifted by 4 nt. Therefore, only one form was considered for the visualisation of the qPCR data. However, it should be noted that the levels of the 3' isomiR of hsa-miR-199a-5p were much higher than those of the corresponding canonical miRNA (mean DC T 11.58 vs. 20.94). Moreover, only one mature miRNA is recorded in miRBase 36 for hsa-mir-548ac, hsa-mir-934, hsa-mir-4304 and hsa-mir-4464, but we could measure miRNA molecules derived from both the 5' arm and the 3' arm.
In comparison of the study groups, significant expression differences were found for 7 canonical miR-NAs and 6 isomiRs (F-test P<0.05) ( Table 3). These differences were also significant in the sensitivity analysis (i.e., after replacing C T values >35 by a value of 35). The differentially expressed mature miRNAs originate from 6 pri-miRNAs: hsa-mir-26a-2, hsa-mir-199a-1, hsa-mir-4304, hsa-mir-4423, hsa-mir-4464 and hsa-mir-4492 ( Figure 2). Compared with the healthy controls, the levels of hsa-miR-26a-2-3p were significantly lower in RRMS patients treated with alemtuzumab (Tukey's test P=0.048). In contrast, the levels of hsa-miR-199a-5p and -3p were higher in alemtuzumab-treated cases than in the controls. Significantly higher levels of hsa-miR-4304-5p were measured in RRMS patients receiving cladribine in comparison to the healthy group and all other RRMS groups (Tukey's test P0.003). In the B-cell samples of the cladribine-treated patient group, we also found, on average, the highest levels of hsa-miR-4464-5p and a 1 nt shifted isomiR of hsa-miR-4492-3p. The average expression of hsa-miR-4423-3p|{isomiR}| 48_67| was highest in patients with PPMS.

Association of genetic variants with microRNA expression
The 12 MIR SNPs as well as two HLA allele tagging SNPs were genotyped for all subjects (Table S2). Overall, the allele frequencies in the study population roughly corresponded to the global allele frequencies ( Table 1). We could confirm a significant association between the risk allele of SNP rs1414273 (OR=2.52) as well as the HLA-DRB1*15:01 allele (OR=2.69) and MS (Fisher's test P<0.05). However, for 4 MIR SNPs, we did not detect the rare allele in either MS patients or controls.
The cis-miR-eQTL analysis was conducted for hsamir-26a-2, hsa-mir-199a-1, hsa-mir-4304, hsa-mir-4423 and hsa-mir-4492. For these pri-miRNAs, we were able to measure corresponding mature miRNAs in at least half of the samples, on the one hand, and to detect both alleles of the MIR SNPs, on the other hand. For the genotype of SNP rs817478, we found a significant association with the levels of hsa-miR-4423-5p in B cells (Wald test P<0.05) ( Table 4). The levels of the canonical miRNA and the respective 3' isomiR were found to be significantly lower in individuals that are homozygous for the risk allele A (Tukey's test P0.008). A similar trend was seen for hsa-miR-4423-3p ( Figure 3). We also observed higher levels of hsa-miR-199a-5p and -3p in carriers of the risk allele T of SNP rs1005039 (Figure 3). These trends did not reach statistical significance (Wald test P=0.702 and P=0.158, respectively) (Table 4), with the analysis being hampered by the low MAF. For hsamir-26a-2, hsa-mir-4304 and hsa-mir-4492, we did not find evidence of a genotype-dependent processing. The results were essentially the same when using the truncated data in the sensitivity analysis.

Verification of allele-specific primary microRNA processing efficiencies
To confirm that MS-associated genetic variants in pri-miRNA processing sites causally affect the Droshamediated stem-loop cleavage, a luciferase reporter system was utilised. The assay was first established with the pri-miRNA hsa-mir-16-1. The wild-type hsa-mir-16-1 was efficiently processed across the entire range of experimental conditions, as reflected by much lower relative luminescence signals in comparison to the negative control ( Figure S2). In contrast, the cleavage activity was completely lost when disrupting the CNNC motif in the flanking region by introducing a C>T mutation at position 3' (+19) (ANOVA P=6.7£10 À29 ).
To further substantiate the results of the cis-miR-eQTL analysis, we then tested whether the SNPs    rs1005039 and rs817478 directly implicate an altered processing of hsa-mir-199a-1 and hsa-mir-4423, respectively. The rare MS risk allele of SNP rs1005039 leads to an A in the second of two adjacent CNNC motifs in the transcript sequence (Figure 4a). In general, a stronger decrease in relative luminescence was observed when The expression of 10 mature miRNAs and isomiRs is visualised for the six study groups. These miRNAs could be detected in >50% of the samples and are derived from 6 primary miRNA transcripts from MS-associated genetic regions. For this plot, we always selected the 5p and 3p miRNA forms (canonical miRNA or isomiR) that had the lowest P-value in the group comparisons ( Table 3). The expression was quantified relative to the reference miRNA hsa-miR-191-5p. Higher data points indicate higher expression levels. The y-axis on the left displays DC T values in an inverted manner and the y-axis on the right displays the data converted in linear scale (2 ÀDCT £1000). Black horizontal lines indicate the means per group. Significance values <0.05 from pairwise Tukey post hoc tests are shown above the brackets. IRT=immune reconstitution therapy, miRNA=microRNA, MS=multiple sclerosis, n=number, PPMS=primary progressive multiple sclerosis, RRMS=relapsing-remitting multiple sclerosis.

MIR SNP
Allele frequency RA MicroRNA assay a 0 RA, mean § SD (mv b ) 1 RA, mean § SD (mv b ) 2 RA, mean § SD (mv b ) P-value c  Table 4: Results of the analysis to detect cis-miR-eQTL effects in B cells.
The expression of mature miRNA molecules in B cells from the peripheral blood was measured by stem-loop qPCR assays. The cis-miR-eQTL analysis included only miRNAs that could be detected in at least half of the samples (n=120) and only MS-associated SNPs in miRNA-coding regions (MIR SNPs) for which both alleles were present in the data.  HeLa cells were transfected with the plasmid carrying the risk allele than when HeLa cells were transfected with the plasmid carrying the other allele (Figure 4b).
Although this difference was not always statistically significant at individual experimental conditions in t-tests, the overall genotype effect was significant when analysing the whole dataset in a 3-way ANOVA (P=0.006). This points to a more efficient processing of the hsamir-199a-1 stem-loop in the presence of the risk allele of SNP rs1005039. The SNP rs817478 denotes an A/C polymorphism. The C allele creates a CNNC motif downstream of hsamir-4423, whereas the motif is missing when the more common MS risk allele A is present. In the luciferase reporter assay, a strong decrease in relative luminescence (>75%) was generally seen after transfection of HeLa cells with GLuc-hsa-mir-4423 constructs. However, a significantly lower decrease was consistently observed when the construct carrying the risk allele was used (t-test P0.019 and ANOVA P=1.3£10 À13 ). This clearly indicates that the MS risk allele of SNP rs817478 is associated with a much lower efficiency of hsa-mir-4423 processing by Drosha (Figure 4c).
We also assessed whether the SNPs rs41292017, rs78351440 and rs7926599 affect the cleavage of hsamir-26a-2, hsa-mir-4304 and hsa-mir-4492, respectively. However, in line with the results of the cis-miR-eQTL analysis (Table 4), there was no evidence of a genotype-dependent processing of these miRNA stemloops in the reporter assays (3-way ANOVA P>0.125). The B-cell expression levels of 4 miRBase-annotated mature miRNAs are visualised. (a and b) A non-significant tendency towards higher expression of hsa-miR-199a-5p and -3p was noted in carriers of the MS risk allele (RA) of SNP rs1005039. (c and d) The RA of SNP rs817478 was associated with a lower expression of hsa-miR-4423-5p in the cis-miR-eQTL analysis (Table 4). Accordingly, the respective SNPs were suspected to affect the miRNA stem-loop processing by Drosha. The number of samples from subjects carrying 0, 1 or 2 RA is indicated below each beeswarm/violin plot. The left and right y-axes refer to the raw qPCR data (DC T values displayed in an inverted manner) and the converted data (2 ÀDCT £1,000), respectively. The means per genotype group are shown as black lines. The Tukey test P-value reaching the significance level of a=0.05 is indicated above the bracket. eQTL=expression quantitative trait locus, miRNA=microRNA, MS=multiple sclerosis, n=number, qPCR=quantitative polymerase chain reaction, SNP=singlenucleotide polymorphism.

Target genes of hsa-mir-199a-1 and hsa-mir-4423
The cis-miR-eQTL analyses and the luciferase reporter assays suggested that MS-associated SNPs modulate the processing of hsa-mir-199a-1 and hsa-mir-4423. To identify their target genes, we transfected HeLa cells either with the respective precursor miRNA expression plasmids, which did not carry the MS risk allele but the other allele, or with the scrambled control plasmid. The qPCR analysis revealed that the endogenous expression of the mature miRNAs in HeLa cells was low or absent. In contrast, a massive overexpression of the miRNAs was detected 24 h and 48 h after transfection with the precursor miRNA plasmids (Table S3).
The subsequent microarray analysis disclosed the expression levels of 135,750 different TCs. The transcriptome data were of high quality and showed a similar distribution across all 18 samples ( Figure S3). Compared to the negative controls, 152 TCs were expressed at significantly lower levels when hsa-mir-199a-1 was overexpressed (P<0.05 and fold-change <À1.5). Of those, 60 transcripts were found to encode proteins and to contain at least one predicted binding site for the respective 5p or 3p mature miRNAs in their 3' UTRs (Table 5). For HeLa cells that were transfected with the hsa-mir-4423 plasmid, we filtered 62 TCs with lower expression (P<0.05 and fold-change <À1.5). However, of those, (a) The predicted secondary structures of the miRNA stem-loops were visualised using the web-based tool forna. 58 Both SNPs are located in a CNNC motif in the 3' flanking region. The MS risk allele (RA) is marked in red. The Drosha cleavage sites are indicated according to the miRBase annotation. 36 (b and c) The processing of hsa-mir-199a-1 and hsa-mir-4423 was analysed using a luciferase-based assay. For this purpose, the miRNA precursor sequences were cloned into the 3' UTR of the GLuc gene within a reporter vector. GLuc/SEAP ratios were then measured after transient transfection of HeLa cells with plasmids carrying either the MS risk allele or the alternative allele. The decrease in relative luminescence was calculated by subtracting from 1 the quotient of the GLuc/ SEAP ratio and the respective ratio that was obtained for the negative control. Higher bars thus indicate higher processing efficiencies. The bars and error bars show the means and standard deviations of 3 biological replicates each. Welch t-test P-values are given above the bars. The MS RA of the SNP rs1005039 conferred a higher hsa-mir-199a-1 cleavage efficiency (b), whereas the RA of SNP rs817478 was associated with a significantly diminished processing of hsa-mir-4423, independently of the amount of plasmid DNA used for transfection and the readout time point (c). ANOVA=analysis of variance (3-way additive model), GLuc=Gaussia luciferase, miRNA=microRNA, MS=multiple sclerosis, SEAP=secreted alkaline phosphatase, SNP=single-nucleotide polymorphism, UTR=untranslated region.
TMEM47 was the only protein-coding transcript with sequence complementarity in the 3' UTR to the corresponding mature miRNAs. The mRNA expression foldchanges and P-values as well as the estimated minimum hybridisation energies are provided for all prioritised target genes in Table S4. Information on their expression in cell types of the blood and brain is given in Table S5. The transcriptome data are available in the GEO database (accession number GSE185859).
A subset of 26 of the 60 target genes that we identified for hsa-mir-199a-1 were also predicted as putative targets in the miRWalk database, 77 and NACC1 and TMOD2 are already listed as experimentally verified targets in the miRTarBase database. 78 The target genes of hsa-mir-199a-1 were found to participate in diverse biological functions, including immune system signalling, metabolism and RNA binding. TMEM47, the prioritised target gene of hsa-mir-4423, is a transmembrane protein that regulates the organisation of cellular junctions. Interactions of hsa-miR-4423-5p and -3p with the 3' UTR of TMEM47 were also predicted by miRWalk 77 (Tables 5 and S6).

Discussion
Until now, about 200 genetic loci have been linked to the inherited risk of developing MS, 26 but our functional understanding of the underlying molecular mechanisms is still scarce. Here, we hypothesised that pri-miRNA processing sites contain causal variants for this disease. We combined expression analyses of B cells, experiments in cell culture and in silico evaluations to shed light on the maturation and function of largely underexplored and primate-specific miRNAs. To our knowledge, at least the prioritised miRNAs with 4-digit numbers have never been explicitly measured in B cells from MS patients before. We found that the Drosha-mediated cleavage of hsa-mir-199a-1 and, in particular, hsa-mir-4423 is influenced by proximal MS risk variants. We have also determined potential direct target genes of these miRNAs.
We first examined the association between the genotype of MS-associated SNPs and the levels of mature miRNAs and isomiRs in B cells. This was done in a representative study population, with the exception that we did not include patients with SPMS. Otherwise, the patient cohort reflected typical characteristics of MS patients in terms of age, sex and disease status, resembling those of patients included in European MS registries. 88 With flow cytometry, we confirmed a strong enrichment of B cells in the samples, while the percentages of the 8 B-cell subpopulations were within the expected ranges: The distribution of B-cell subsets was quite similar when comparing healthy subjects with the group of RRMS patients before IRT. Patients receiving natalizumab infusions had a lower proportion of naive B cells but a higher proportion of memory B cells in the blood, as previously described. 89,90 In contrast, patients who were treated with oral cladribine or intravenous alemtuzumab showed a relative increase in the frequency of transitional and naive B cells and a decrease in the frequency of memory B cells. This is in line with earlier studies that have shown that B-cell repopulation following infusion of alemtuzumab or (subcutaneous) administration of cladribine results in such shifts in Bcell subpopulations. 15,16,91,92 However, the effects of these IRTs for MS on miRNA expression have not been investigated so far.
Compared to the healthy controls, we observed a significant differential expression of hsa-miR-26a-5p and -3p in RRMS patients treated with natalizumab and alemtuzumab, respectively. In previous studies, altered levels of hsa-miR-26a-5p were detected in serum, 93,94 whole blood, 95   We obtained transcriptome profiles of HeLa cells in which the respective miRNA precursor transcripts were overexpressed. Listed are the potential direct target genes identified by this screening that encode proteins and for which a hybridisation of their 3 0 UTR sequences with the mature miRNA sequences could be computationally verified. The online tools Reactome 79 and Enrichr 80 were used to analyse the involvement of the target genes of hsa-mir-199a-1 in biological processes. See Table S4 and Table S6 for further information. miRNA=microRNA, N=number, UTR=untranslated region.
patients with MS. However, inconsistent results have been described, most likely due to different study designs. For instance, in response to interferon-b therapy, higher levels of this miRNA were reported in thrombocytes, 99 but lower levels were reported in serum exosomes. 94 Our study differs from the existent studies in that we explicitly analysed B cells, that we also measured hsa-miR-26a-3p levels and that we included patients who received IRTs. The pri-miRNA hsa-mir-26a-2 is encoded on chromosome 12 and belongs to the mir-26 family, which is conserved in vertebrates. 36 Of note, hsa-miR-26a-5p is also produced from the hsa-mir-26a-1 locus on chromosome 3, which weakened the validity of the cis-miR-eQTL analysis.
There is accumulating evidence that miR-26a miRNAs play pivotal roles in cellular differentiation, cell growth and apoptosis and that they are involved in the development of various human diseases. 100 In experimental autoimmune encephalomyelitis (EAE), an animal model of MS, silencing of miR-26a-5p was found to result in increased expression of T h 17-related cytokines and increased disease severity, while overexpression of the precursor miRNA was found to result in reduced expression of T h 17-related cytokines and a milder form of EAE. 95 A direct target of miR-26a-5p is IL6 mRNA. 95 We measured higher levels of hsa-miR-199a-5p and -3p in B cells from alemtuzumab-treated patients than in those from healthy controls. The literature on the expression of these miRNAs in MS patients is heterogeneous, with previous studies performed with sera, 101 PBMC, 102 CD4 + T cells, 103 cerebrospinal fluid samples 104 and brain lesion tissues. 105 Remarkably, the levels of hsa-miR-199a-5p were described to be »3 times higher in CD4 + T cells of RRMS patients in remission compared to healthy subjects 103 and in MS lesions compared to normal brain white matter. 105 The mir-199 family is evolutionarily highly conserved, and it should be noted that the -5p and -3p mature miRNA molecules are produced from two loci (hsa-mir-199a-1 on chromosome 19 and hsa-mir-199a-2 on chromosome 1). 36 Functional studies have demonstrated that especially the -3p mature form may act as either a tumour suppressor or an oncogene, depending on the type of cancer. 106 The miR-199a miRNAs were shown to regulate tumour cell proliferation, invasion, apoptosis and glucose metabolism, to protect against hypoxia-induced damage in cardiomyocytes, to modulate stem cell differentiation and fetal development and to affect adipocyte differentiation and fatty acid composition. 106À108 In our study, only 5 B-cell samples were heterozygous for the MS-associated allele T of SNP rs1005039 that is located in one of two CNNC motifs flanking hsamir-199a-1. The low MAF possibly accounts for the fact that the suspected cis-miR-eQTL effect could not be detected with statistical significance. However, the results of the luciferase reporter assays indicated a higher pri-miRNA processing efficiency in HeLa cells when the MS risk allele was present. Likewise, Roden et al. have demonstrated that an exchange of the NN bases within CNNC can have a mild effect on the stemloop processing. 47 Moreover, a previous study reported that the C allele of SNP rs3760781, which has a distance of »19 kb to the pri-miRNA locus, is associated with significantly higher levels of hsa-miR-199a-3p and -5p in plasma. 109 The worldwide frequency of the haplotype TCC of the MIR SNP (rs1005039), the SNP from 109 (rs3760781) and the MS lead SNP (rs12609500) is 1.3% according to the 1000 Genomes data. 51,52 In such a context, a small sample size, as in our study, implies high uncertainty in the inferences that can be drawn. 110 Hence, to further substantiate the presumed causal relationship between genetic risk, miRNA expression and MS, a larger study population and/or a family-based approach is needed. A number of target genes of miR-199a miRNAs have been already experimentally determined in previous studies, as recorded in the miRTar-Base database. 78 In the present study, 60 proteincoding target genes were prioritised for hsa-mir-199a-1, some of which are novel. For instance, the immunerelated genes IRAK2, MIF and VIM were not contained as known or predicted targets in the databases miRTar-Base 78 and miRWalk, 77 respectively. These and other target gene candidates that were identified by our transcriptome screening await further confirmation in future studies.
The MS risk allele A of SNP rs817478 was associated with significantly lower levels of hsa-miR-4423-5p in B cells in our cis-miR-eQTL analysis. The same trend was seen for hsa-miR-4423-3p. In the reporter assay experiment, an allele-dependent cleavage rate of hsa-mir-4423 was clearly confirmed across all experimental conditions. These data thus suggest a causal role of miR-4423 miRNAs in the genetic predisposition to MS. The association of the rs817478 genotype with hsa-miR-4423-5p expression was previously identified as one of 57 cis-miR-eQTLs in the European subset of the Geuvadis RNA-sequencing data on EBV-transformed lymphoblastoid cell lines. 33 The eQTL is also contained in the ncRNA-eQTL database, which is solely based on cancerrelated studies. 111 The genetic effect is explained by the fact that there is a 3 0 flanking CNNC motif in the presence of the minor allele C but not in the presence of the MS risk allele A. This sequence feature is well known to enhance the Drosha-mediated processing and is found downstream of one third of all human conserved miRNA stem-loops. 44,47 The CNNC motif is bound by SRSF3, which is a major auxiliary factor for pri-miRNA processing, 112 and both hsa-mir-199a-1 and hsa-mir-4423 were found to be productively processed by Drosha in the presence of SRSF3. 113 The pri-miRNA hsa-mir-4423 has recently evolved: It is currently only annotated in the genomes of humans and chimpanzees. 36 The mature miRNA from the 3 0 arm has been first identified in malignant B cells. 114 Perdomo et al. have shown that hsa-mir-4423 is a regulator of airway epithelial cell differentiation and a repressor of lung carcinogenesis. 115 Otherwise, its function is still largely unclear. The miR-TarBase database currently contains no miRNA-target interaction with strong evidence (i.e., with verification by qPCR, western blot or reporter assay) for hsa-mir-4423. 78 Our analysis revealed only TMEM47 as potential direct target gene of this miRNA. This gene was also found to be significantly lower expressed in SW900 cells overexpressing hsa-mir-4423 in the data from Perdomo et al. (GEO accession GSE48796). 115 TMEM47 is an adherens junction protein involved in regulating epithelial cell junction organisation. 116 It is also upregulated during astrocyte maturation, 117 but it remains to be clarified to which degree it plays a role in MS. Further investigations on the cell type-specific functions of this gene are warranted.
Several limitations of this study should be considered in the data interpretation. First, the patients in our study were heterogeneous with respect to the treatment they received. We accounted for this by including the group assignment in the cis-miR-eQTL analysis, while also adjusting for age and sex, but the possibility of unmeasured confounding cannot be ruled out. It also remains unclear whether other IRTs, such as ocrelizumab, affect the expression of the miRNAs. Second, the measurement of mature miRNA levels was conducted in B cells, and we restricted our analysis to miRNAs that could be detected in at least half of the samples. However, miRNAs from hsa-mir-3661 and hsa-mir-4252 could not be assayed properly, and other miRNAs were barely detectable, including hsa-miR-548ac-3p, for which we have previously shown a cis-miR-eQTL effect and which is preferentially expressed in CD8 + T cells and natural killer cells. 48,118 We therefore presumably missed eQTLs for miRNAs that are expressed predominantly in other cells as well as eQTLs that are cell typespecific, a phenomenon that has been studied more thoroughly for protein-coding genes. 119,120 Deeper insights on the impact of the genetic risk variants might be achieved by examining other peripheral immune cells, CNS-resident cells and/or specific B-cell subsets, e.g., memory B cells. Third, our study was limited to selected SNPs and miRNAs. We did not include rare variants with MAF<1% because the sample size would have been insufficient to study cis-miR-eQTL associations. We also excluded miRNAs that are produced independently of Drosha (e.g., mirtrons 39 ). Moreover, it is difficult to define possible isomiRs and to measure their expression accurately. Here, we employed stem-loop RT-qPCR assays. These assays allow to discriminate between closely related mature miRNAs, 121 but different 3' isomiR species may be cross-detected. 122 The exclusive measurement of miRNA isoforms that differ by only 1 nt at their termini is possible by dumbbell PCR with TaqMan probes, 123 but there are currently no commercial assays for this. Fourth, the efficiency of pri-miRNA processing was evaluated using reporter assays in which transiently expressed luciferase mRNA is cleaved by endogenous Microprocessor in HeLa cells. 69,70 As an alternative, recombinant Drosha and DGCR8 proteins can be used in kinetic assays to more precisely determine the allele-specific rates at which the cleavage occurs. 124 It is important to note, however, that such assays do not reflect all regulatory events taking place in vivo. In fact, various mechanisms coordinate the biogenesis and turnover of mature miRNAs (e.g., Dicer activity and miRNA decay 39 ). Further research is needed to elucidate how the interplay of genetic factors with environmental and lifestyle factors (e.g., EBV infection and vitamin D deficiency) may alter molecular networks related to miRNA maturation and function. 125,126 Fifth, a transcriptome-wide analysis was conducted to identify the most likely direct target genes of hsa-mir-199a-1 and hsa-mir-4423. However, we might have missed actual target genes, e.g., those that are not endogenously expressed in HeLa cells and those that are inhibited by translational repression rather than mRNA destabilisation. In future studies, improved experimental strategies for miRNA target identification may be applied to verify effects at the protein level and to better resolve the functional differences between miRNA family members and isoforms. 127 In conclusion, we here focused on miRNAs that are encoded in MS-associated genetic regions. We observed a significant differential expression of mature miRNAs and isomiRs from hsa-mir-26a-2 and hsa-mir-199a-1 in B cells when comparing subgroups of RRMS patients and healthy controls. Based on cis-miR-eQTL analyses and reporter assays, we found that the MS risk allele T of SNP rs1005039 is related to a somewhat more efficient Drosha-dependent processing of hsa-mir-199a-1, whereas the MS risk allele A of SNP rs817478 confers a much lower hsa-mir-4423 processing efficiency. These findings complement the existing body of literature on the dysregulation of miRNAs in MS and on sequence determinants of pri-miRNA processing. Our and other data indicate that diverse biological processes are regulated by hsa-mir-199a-1, while the role of hsa-mir-4423 remains poorly defined. For a deeper understanding of the pathomechanisms of MS, further functional studies are needed to identify the disease-causing genetic variants and to ascertain their effects on transcriptional activity and RNA processing.

Contributors
MH, BF and UKZ conceptualised the study. MH, NB, EP and UKZ secured the research funding. MS, AW, SM and AD coordinated the blood sampling and collected clinical information. NB, EP and BF processed the blood samples. NB executed the majority of the experiments. DK performed the microarray analysis. PL provided valuable suggestions for the reporter assays.
NB and MH analysed and interpreted the data and prepared the figures and tables. UKZ supervised the research and provided important intellectual content. MH drafted the original manuscript, and all authors reviewed, edited and prepared the final version. MH and NB had full access to all the data and take responsibility for the integrity and accuracy of the data analysis. EP, BF and DK have verified the underlying data. All authors have read and approved the final version of the manuscript.

Data sharing statement
The data supporting the findings of this study are available in the article and/or supplementary materials. The transcriptome data from the target gene analysis are publicly available from the GEO database (https://www. ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE185859). Readers are welcome to contact the corresponding author by e-mail to request access to the raw de-identified data used in this work.

Declaration of interests
MH received speaking fees and travel funds from Bayer HealthCare, Biogen, Merck, Novartis and Teva. AW received speaking fees and travel funds from Biogen, GlaxoSmithKline, Merck Serono, Novartis and Sanofi Genzyme. NB received travel funds from Novartis. UKZ received research support as well as speaking fees and travel funds from Alexion, Almirall, Bayer HealthCare, Biogen, Bristol Myers Squibb, Janssen, Merck Serono, Novartis, Roche, Sanofi Genzyme, Teva as well as EU, BMBF, BMWi and DFG. BF, EP, MS, SM, AD, DK and PL declare that they have no competing interests.